Linear Models and Summary
2025-07-20
An Ode to Harold

What sort of data do you work with that have multiple units observed over time?
Fortunately, the computer can handle this, but the structure has to be declared. Stata has its ts commands and R has the tsibble structure that I will use, among many others.
A matrix is a rectangular array of real numbers. If it has m rows and n columns, we say it is of dimension m\times n.
A = \left( \begin{array}{ccccc} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{array} \right)
A vector (x_{1}, x_{2}, \ldots, x_{n}) \in \mathbb{R}^{n} can be thought of as a matrix with n rows and one column or as a matrix with one row and n columns.
Periodically, we will wish to make use of two types of vector products: the inner product \mathbf{u}^{\prime}\mathbf{v} = \sum_{i} u_{i}v_{i}, which is a scalar, and the outer product \mathbf{uv}^{\prime}, which is a matrix:
\mathbf{uv}^{\prime} = \left( \begin{array}{ccccc} u_{1}v_{1} & u_{1} v_{2} & \ldots & u_{1} v_{n} \\ u_{2}v_{1} & u_{2}v_{2} & \ldots & u_{2}v_{n} \\ \vdots & \vdots & \ddots & \vdots \\ u_{m}v_{1} & u_{m}v_{2} & \ldots & u_{m} v_{n} \end{array} \right)
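A quick sketch of both products in R; the vectors u and v here are illustrative values of my choosing:

```r
# Inner and outer products on small illustrative vectors
u <- c(1, 2, 3)
v <- c(4, 5, 6)
inner <- sum(u * v)      # u'v, a scalar: 4 + 10 + 18 = 32
outer.uv <- u %*% t(v)   # uv', a 3 x 3 matrix; outer(u, v) is equivalent
```

Note that `%*%` treats a bare vector as a column on the left and `t(v)` supplies the row, so the dimensions conform as n x 1 times 1 x n.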
There are two primary methods for inverting matrices. The first is often referred to as Gauss-Jordan elimination and the second is known as Cramer’s rule. The former involves a series of elementary row operations undertaken on the matrix of interest and, in parallel, on an identity matrix I, while the latter relies on the determinant and the adjoint of the matrix of interest.
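Both routes can be checked numerically; a sketch with a small invertible matrix of my choosing (R's solve() computes the inverse via an LU decomposition, and the 2 x 2 adjoint is worked by hand):

```r
# solve() inverts numerically; for a 2 x 2 we can also verify Cramer's rule
A <- matrix(c(2, 1, 1, 3), nrow = 2)       # det(A) = 2*3 - 1*1 = 5
A.inv <- solve(A)
adjA <- matrix(c(3, -1, -1, 2), nrow = 2)  # adjoint of A, worked by hand
check1 <- all.equal(A %*% A.inv, diag(2))  # inverse recovers the identity
check2 <- all.equal(A.inv, adjA / det(A))  # Cramer's rule: adjoint / determinant
```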
Let A and B be square invertible matrices. It follows that (AB)^{-1} = B^{-1}A^{-1}, (A^{-1})^{-1} = A, and (A^{\prime})^{-1} = (A^{-1})^{\prime}.
For a matrix A, the following are equivalent:

1. A is invertible.
2. A is nonsingular.
3. For all y, Ax=y has a unique solution.
4. \det A \neq 0 and A is square.
Given a square matrix A and a vector \mathbf{x}, we say that

- A is negative definite if, \forall x \neq 0,\; x^{T}Ax < 0
- A is positive definite if, \forall x \neq 0,\; x^{T}Ax > 0
- A is negative semi-definite if, \forall x,\; x^{T}Ax \leq 0
- A is positive semi-definite if, \forall x,\; x^{T}Ax \geq 0
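For symmetric matrices, an equivalent check uses eigenvalues: all positive means positive definite, all negative means negative definite, and the weak inequalities give the semi-definite cases. A sketch with a matrix of my choosing:

```r
# Definiteness read off the eigenvalues of a symmetric matrix
A <- matrix(c(2, 1, 1, 2), nrow = 2)
evA <- eigen(A, symmetric = TRUE)$values   # 3 and 1: positive definite
evB <- eigen(-A, symmetric = TRUE)$values  # -1 and -3: negative definite
```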
Brief diversion on principal submatrices and (leading) principal minors toward a sufficient condition for characterizing definiteness.
NB: The trace of a square matrix is the sum of its diagonal elements.
We needed this to:
Random Variable: a real-valued function whose domain is a sample space.
Mean (Expected Value): E[x] = \int_{x} x\; f(x)\; dx or E[x] = \sum_{x} x\; p(x)
Variance (Spread): V[x] = E(x^2) - [E(x)]^2
Covariance: Cov(x,y) = E[(x_{i} - E(x))(y_{i} - E(y))]
Correlation: \rho = \frac{E[(x_{i} - E(x))(y_{i} - E(y))]}{\sigma_{x}\sigma_{y}} = \frac{Cov(x,y)}{\sqrt{\sigma^{2}_{x}}\sqrt{\sigma^{2}_{y}}}
Variance of Linear Combination: V\left[\sum_{i} a_{i}X_{i}\right] = \sum_{i}\sum_{j} a_{i}a_{j}Cov(X_{i},X_{j})
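The linear-combination formula can be verified with sample moments, where it holds exactly by bilinearity; a sketch on simulated data of my choosing:

```r
# Check V(sum a_i X_i) = sum_i sum_j a_i a_j Cov(X_i, X_j) with sample moments
set.seed(42)
X <- rnorm(1e4)
Y <- 0.5 * X + rnorm(1e4)            # a correlated pair
a <- c(2, -1)
lhs <- var(a[1] * X + a[2] * Y)      # direct variance of the combination
S <- cov(cbind(X, Y))                # sample covariance matrix
rhs <- as.numeric(t(a) %*% S %*% a)  # the double sum as a quadratic form a'Sa
```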
A random variable X has a normal distribution with mean \mu \in \mathbb{R} and variance \sigma^{2} \in \mathbb{R}^{++} if X has a continuous distribution for which the probability density function (p.d.f.) f(x|\mu,\sigma^2) is as follows (for finite x): f(x|\mu,\sigma^2 ) = \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp \left[ -\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^{2}\right]

- If X \sim N(\mu,\sigma^2) and Y = aX + b (a \neq 0), then Y \sim N(a\mu + b, a^{2}\sigma^{2}).
- Z \sim N(0,1) allows the percentiles for any normal via Z = \frac{x - \mu}{\sigma}.
- Sums of (independent) normals are normal.
- Sums of affine transformations of (independent) normals are normal.
Let X_1, X_2, \ldots, X_n be independent, identically distributed normal random variables with mean \mu and variance \sigma^2. The sample mean \widehat{\mu}=\frac{\sum X_i}{n} is a complete sufficient statistic for \mu – it is informationally optimal for estimating \mu. The sample variance \widehat{\sigma}^2=\frac{\sum \left(X_i-\bar{X}\right)^2}{n-1} is an ancillary statistic for \mu – its distribution does not depend on \mu.
These statistics are independent (also can be proven by Cochran’s theorem). This property (that the sample mean and sample variance of the normal distribution are independent) characterizes the normal distribution; no other distribution has this property.
If a random variable X has a \chi^{2} distribution with n degrees of freedom, then the probability density function of X (given x > 0) is f(x) = \frac{1}{2^{\frac{n}{2}}\Gamma(\frac{n}{2})}x^{(\frac{n}{2}) - 1}\exp\left(\frac{-x}{2}\right)
Two key properties: 1. If the random variables X_{1},\ldots,X_{k} are independent and if X_{i} has a \chi^2 distribution with n_{i} degrees of freedom, then \sum_{i=1}^{k} X_{i} has a \chi^{2} distribution with \sum_{i=1}^{k} n_{i} degrees of freedom.
2. If the random variables X_{1},\ldots,X_{k} are independent and, \forall i: X_{i} \sim N(0,1), then \sum_{i=1}^{k} X^{2}_{i} has a \chi^{2} distribution with k degrees of freedom.
Consider two independent random variables Y and Z, such that Y has a \chi^{2} distribution with n degrees of freedom and Z has a standard normal distribution. If we define X = \frac{Z}{\sqrt{\frac{Y}{n}}} then the distribution of X is called the t distribution with n degrees of freedom. The t has density (for finite x) f(x) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{(n\pi)^{\frac{1}{2}}\Gamma\left(\frac{n}{2}\right)}\left(1 + \frac{x^{2}}{n}\right)^{\frac{-(n+1)}{2}}
Consider two independent random variables Y and W, such that Y has a \chi^{2} distribution with m degrees of freedom and W has a \chi^{2} distribution with n degrees of freedom, where m, n \in \mathbb{R}^{++}. We can define a new random variable X as follows: X = \frac{\frac{Y}{m}}{\frac{W}{n}} = \frac{nY}{mW} then the distribution of X is called an F distribution with m and n degrees of freedom.
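These constructions link up neatly: the square of a t with \nu degrees of freedom is an F with (1, \nu) degrees of freedom, which we can confirm from R's quantile functions (the 5-degrees-of-freedom choice is illustrative):

```r
# The t-F link: a squared t_{n} quantile matches the corresponding F(1, n) quantile
n <- 5
stat.t2 <- qt(0.975, df = n)^2          # two-sided t critical value, squared
stat.f  <- qf(0.95, df1 = 1, df2 = n)   # one-sided F critical value
```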
This gets us through the background to this point. We will invoke parts of this as we go from here. We have matrices and their inverses. We have some distributional results that link the normal, \chi^2, t, and F. We have a theorem regarding independence (Basu) and the normal. This gives us the intuition for Gauss-Markov. Nevertheless, let’s begin the meat of it all: regression.
One of the features of the Ordinary Least Squares estimator is the orthogonality of the estimation space and the error space (independence, under normality).
Now we want to reexamine the minimization of the sum of squared errors in a matrix setting. We wish to minimize the inner product \epsilon^{\prime}\epsilon. \epsilon^{\prime}\epsilon = (y - X\beta)^{\prime}(y - X\beta) = y^{\prime}y - y^{\prime}X\beta - \beta^{\prime}X^{\prime}y + \beta^{\prime}X^{\prime}X\beta
Take the derivative, set it equal to zero, and solve: \frac{\partial \epsilon^{\prime}\epsilon}{\partial \beta} = -2X^{\prime}y + 2X^{\prime}X\beta = 0 \Rightarrow X^{\prime}y = X^{\prime}X\beta
So we rearrange to obtain the solution in matrix form.
\hat{\beta}_{OLS} = (X^{\prime}X)^{-1}X^{\prime}y
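The closed form is easy to verify against lm(); a sketch on simulated data of my choosing:

```r
# (X'X)^{-1} X'y by hand, checked against lm() on simulated data
set.seed(1)
n <- 200
X <- cbind(1, rnorm(n), rnorm(n))             # intercept and two covariates
y <- X %*% c(1, 2, -3) + rnorm(n)
beta.hat <- solve(t(X) %*% X) %*% t(X) %*% y  # the closed-form OLS solution
beta.lm <- coef(lm(y ~ X[, 2] + X[, 3]))      # the same numbers from lm()
```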
We need nothing about the distribution other than the two moment definitions. It is for the third result that this starts to matter and, in many ways, this is directly a reflection of Basu’s theorem.
With \hat{\beta}=(X^{\prime}X)^{-1}X^{\prime}y, unbiasedness, \mathbb{E}[\hat{\beta} - \beta] = 0, requires \mathbb{E}[(X^{\prime}X)^{-1}X^{\prime}y - \beta] = 0 We require an inverse already. Invoking the definition of y, we get \mathbb{E}[\mathbf{(X^{\prime}X)^{-1}X^{\prime}}(\mathbf{X}\beta + \epsilon) - \beta] = 0 \mathbb{E}[\mathbf{(X^{\prime}X)^{-1}X^{\prime}}\mathbf{X}\beta + \mathbf{(X^{\prime}X)^{-1}X^{\prime}}\epsilon - \beta] = 0 Because \mathbf{(X^{\prime}X)^{-1}X^{\prime}X} = \mathbf{I}, the \beta terms cancel, leaving \mathbb{E}[\hat{\beta} - \beta] = \mathbb{E}[\mathbf{(X^{\prime}X)^{-1}X^{\prime}}\epsilon] If the latter term is zero, all is well.
\mathbb{E}[(\hat{\mathbf{\beta}} - \beta)(\hat{\mathbf{\beta}} - \beta)^{\prime}] can be derived as follows. \mathbb{E}[(\mathbf{(X^{\prime}X)^{-1}X^{\prime}}\mathbf{X}\beta + \mathbf{(X^{\prime}X)^{-1}X^{\prime}}\epsilon - \beta)(\mathbf{(X^{\prime}X)^{-1}X^{\prime}}\mathbf{X}\beta + \mathbf{(X^{\prime}X)^{-1}X^{\prime}}\epsilon - \beta)^{\prime}] \mathbb{E}[(\mathbf{I}\beta + \mathbf{(X^{\prime}X)^{-1}X^{\prime}}\epsilon - \beta)(\mathbf{I}\beta + \mathbf{(X^{\prime}X)^{-1}X^{\prime}}\epsilon - \beta)^{\prime}] Recognizing the zero part from before, we are left with the manageable,
\mathbb{E}[(\hat{\mathbf{\beta}} - \beta)(\hat{\mathbf{\beta}} - \beta)^{\prime}] = \mathbb{E}[\mathbf{(X^{\prime}X)^{-1}X^{\prime}}\epsilon\epsilon^{\prime}\mathbf{X(X^{\prime}X)^{-1}}] Nifty. With nonstochastic \mathbf{X}, it is the structure of \mathbb{E}[\epsilon\epsilon^{\prime}] that matters, and we know what that is: by assumption, \sigma^{2}\mathbf{I}, so the expression collapses to \sigma^{2}\mathbf{(X^{\prime}X)^{-1}}. If \mathbf{X} is stochastic, we need more steps to get to the same place.
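The \sigma^{2}\mathbf{(X^{\prime}X)^{-1}} structure can be checked against lm()'s vcov(); a sketch on simulated data of my choosing, with \hat{\sigma}^{2} estimated with n - k degrees of freedom:

```r
# sigma^2 (X'X)^{-1} by hand, matched against vcov() from lm()
set.seed(2)
n <- 200
X <- cbind(1, rnorm(n))
y <- X %*% c(1, 2) + rnorm(n)
fit <- lm(y ~ X[, 2])
s2 <- sum(residuals(fit)^2) / (n - 2)  # hat sigma^2 with n - k df
V.by.hand <- s2 * solve(t(X) %*% X)    # the covariance matrix of beta-hat
```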
Proving the Gauss-Markov theorem is not so instructive. From what we already have, restricted to linear estimators, we take OLS and add or subtract something; after computation, we get the OLS covariance matrix plus a positive semi-definite matrix. OLS always wins. From here, a natural place to go is corrections for non-\mathbf{I} error covariance. We will do plenty of that. And we will eventually need Aitken.
Beyond this, let’s take up two special matrices (that will become your favorite matrices):
Projection Matrix : \mathbf{X}(\mathbf{X}^{\prime}\mathbf{X})^{-1}\mathbf{X}^{\prime}
Residual Maker : \mathbf{I} - \mathbf{X}(\mathbf{X}^{\prime}\mathbf{X})^{-1}\mathbf{X}^{\prime}
which are both symmetric and idempotent (\mathbf{M}^{2}=\mathbf{M}).
\mathbf{M}

\mathbf{M} = \mathbf{I} - \mathbf{X(X^\prime X)^{-1}X^\prime}
\mathbf{My} = (\mathbf{I} - \mathbf{X(X^\prime X)^{-1}X^\prime})\mathbf{y}
\mathbf{My} = \mathbf{Iy} - \mathbf{X}\underbrace{\mathbf{(X^\prime X)^{-1}X^\prime y}}_{\hat{\beta}}
\mathbf{My} = \mathbf{y} - \mathbf{X}\hat{\beta}
\mathbf{My} = \hat{\epsilon}
\mathbf{P}

\mathbf{P} = \mathbf{I - M}
\mathbf{P} = \mathbf{I - (I - X(X^\prime X)^{-1}X^\prime)}
\mathbf{P} = \mathbf{X(X^\prime X)^{-1}X^\prime}
\mathbf{Py} = \mathbf{X}\underbrace{\mathbf{(X^\prime X)^{-1}X^\prime y}}_{\hat{\beta}}
\mathbf{Py} = \mathbf{X}\hat{\beta}
\mathbf{Py} = \hat{\mathbf{y}}
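Both matrices are easy to build and check numerically; a sketch on simulated data of my choosing:

```r
# P and M at work: Py recovers fitted values, My the residuals, and M is idempotent
set.seed(3)
n <- 50
X <- cbind(1, rnorm(n))
y <- X %*% c(1, 2) + rnorm(n)
P <- X %*% solve(t(X) %*% X) %*% t(X)  # projection matrix
M <- diag(n) - P                       # residual maker
fit <- lm(y ~ X[, 2])
fitted.ok <- all.equal(as.numeric(P %*% y), unname(fitted(fit)))
resid.ok  <- all.equal(as.numeric(M %*% y), unname(residuals(fit)))
idem.ok   <- all.equal(M %*% M, M)     # M^2 = M
```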
y = X\beta + \epsilon
The latter F test appears in standard regression output; the reported probability compares the model you have estimated to a model with only a constant, i.e., H_0: constant only. If we just like F better than t, it happens that (t_{\nu})^2 \sim F_{1,\nu}, but t has the advantage of being (potentially) one-sided.
In forming confidence intervals, one must account for metrics. t is defined by a standard deviation metric and the standard deviation remains in a common metric with the parameter \hat{\beta}. Variance represents a squared metric in terms of the units measured by \hat{\beta}. As a result, we will form confidence intervals from standard deviations instead of variances.
Prediction intervals: A future response: \hat{y}_{0} \pm t^{(\frac{\alpha}{2})}_{n-p}\hat{\sigma}\sqrt{1 + x^{\prime}_{0}(\mathbf{X^{\prime}X})^{-1}x_{0}}
A mean response: \hat{y}_{0} \pm t^{(\frac{\alpha}{2})}_{n-p}\hat{\sigma}\sqrt{x^{\prime}_{0}(\mathbf{X^{\prime}X})^{-1}x_{0}}
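R's predict() implements both formulas; the prediction interval is wider because of the extra 1 under the square root. A sketch on simulated data of my choosing:

```r
# Prediction (future response) vs confidence (mean response) intervals
set.seed(4)
d <- data.frame(x = rnorm(100))
d$y <- 1 + 2 * d$x + rnorm(100)
fit <- lm(y ~ x, data = d)
new <- data.frame(x = 1)
pred.i <- predict(fit, new, interval = "prediction")  # future response: wider
conf.i <- predict(fit, new, interval = "confidence")  # mean response: narrower
```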
Suppose that a correct specification is \mathbf{y} = \mathbf{X_{1}\beta_{1}} + \mathbf{X_{2}\beta_{2}} + \mathbf{\epsilon} where \mathbf{X_{1}} consists of k_{1} columns and \mathbf{X_{2}} consists of k_{2} columns. If we regress \mathbf{y} on \mathbf{X_{1}} alone, without including \mathbf{X_{2}}, we can characterize \mathbf{b_{1}}: \mathbf{b_{1}} = (\mathbf{X_{1}^{\prime}X_{1}})^{-1}\mathbf{X_{1}^{\prime}y} \rightarrow (\mathbf{X_{1}^{\prime}X_{1}})^{-1}\mathbf{X^{\prime}_{1}}[\mathbf{X_{1}\beta_{1}} + \mathbf{X_{2}\beta_{2}} + \mathbf{\epsilon}] = (\mathbf{X_{1}^{\prime}X_{1}})^{-1}\mathbf{X^{\prime}_{1}}\mathbf{X_{1}\beta_{1}} + (\mathbf{X_{1}^{\prime}X_{1}})^{-1}\mathbf{X^{\prime}_{1}}\mathbf{X_{2}\beta_{2}} + (\mathbf{X_{1}^{\prime}X_{1}})^{-1}\mathbf{X^{\prime}_{1}}\mathbf{\epsilon} = \mathbf{\beta_{1}} + (\mathbf{X_{1}^{\prime}X_{1}})^{-1}\mathbf{X^{\prime}_{1}}\mathbf{X_{2}\beta_{2}} + (\mathbf{X_{1}^{\prime}X_{1}})^{-1}\mathbf{X^{\prime}_{1}}\mathbf{\epsilon} Two elements are worthy of consideration.
1. If \mathbf{\beta_{2}}=0 everything is fine, assuming the standard assumptions hold. The reason: we have not really misspecified the model. 2. Also, if the standard assumptions hold and \mathbf{X_{1}^{\prime}X_{2}}=0 then the second term also vanishes (even though \mathbf{\beta_{2}}\neq 0). If, on the other hand, neither of these conditions hold, but we estimate the regression in any case, the estimate of \mathbf{b_{1}} will be biased by a factor of (defining \mathbf{P}= (\mathbf{X_{1}^{\prime}X_{1}})^{-1}\mathbf{X^{\prime}_{1}}\mathbf{X_{2}})
\mathbf{P}_{X_{1}X_{2}}\mathbf{\beta_{2}} What is \mathbf{P}_{X_{1}X_{2}}? It is the matrix of coefficients from regressing the columns of \mathbf{X_{2}} on \mathbf{X_{1}}.
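A simulation makes the bias concrete; the data-generating values (0.7 correlation loading, coefficients 2 and 3) are illustrative choices of mine:

```r
# Omitted variable bias: correlated X1 and X2, with X2 left out of the short regression
set.seed(5)
n <- 1e4
x1 <- rnorm(n)
x2 <- 0.7 * x1 + rnorm(n)              # X1'X2 != 0 by construction
y <- 1 + 2 * x1 + 3 * x2 + rnorm(n)
b.short <- coef(lm(y ~ x1))["x1"]      # near 2 + 0.7 * 3 = 4.1: biased
b.long  <- coef(lm(y ~ x1 + x2))["x1"] # near 2: the correct specification
```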
In specifying a regression model, we assume that its assumptions apply equally well to all the observations in our sample. They may not. Fortunately, we can test claims of structural stability using techniques that we already have encountered. H_0 : Structural stability.
Now we can move on to considering the properties of the residuals and their conformity with the assumptions we have made about them.
The Jarque-Bera test of residual normality (in Stata, the relevant moments come from sum, detail) combines skewness S and kurtosis K: JB = \frac{n}{6}\left(S^{2} + \frac{(K-3)^{2}}{4} \right)

There are two types of models, in general:

1. Nested models
2. Nonnested models
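The JB statistic is simple enough to compute from the moments directly; a sketch, with the function name and the simulated draws being my illustrative choices:

```r
# Jarque-Bera by hand from central moments
jb <- function(e) {
  n <- length(e)
  m <- e - mean(e)
  S <- mean(m^3) / mean(m^2)^(3 / 2)  # skewness
  K <- mean(m^4) / mean(m^2)^2        # kurtosis
  (n / 6) * (S^2 + (K - 3)^2 / 4)
}
set.seed(6)
jb.normal <- jb(rnorm(1e4))  # small for normal draws
jb.skewed <- jb(rexp(1e4))   # large for decidedly non-normal draws
```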
In layman’s terms, nested models arise when one model is a special case of the other. For example, \mathbf{y} = \mathbf{\beta_{0}} + \mathbf{X_{1}\beta_{1}} + \mathbf{\epsilon} is nested in \mathbf{y} = \mathbf{\beta_{0}} + \mathbf{X_{1}\beta_{1}} + \mathbf{X_{2}\beta_{2}} + \mathbf{\epsilon} using the restriction that \mathbf{\beta_{2}} = 0. If models are nested, the usual techniques can be used. If not, we must turn to alternative tools. Technically, there is probably an intermediate class that would be appropriately named overlapping. Practically, overlapping models have some nested elements and some nonnested elements. Almost always, we will need the nonnested tools for these.
Consider i \in N units observed at t \in T points in time. The normal cross-sectional data structure will use variables as columns and i as rows. Panel data then adds the complication of a third dimension, t. If we were to take this third dimension and flatten the array to two dimensions, we would end up with an (N \times T) by K matrix of covariates and (for a single outcome) an NT \times 1 vector.
Hsiao isolates many of the central issues in panel data from the view of an econometrician. The argument is a bit broader in the sense of repetition.
Berk and Freedman isolate important issues of particular relevance to the types of structures we will look at.
Given two-dimensional data, how should we break it down? The most common method uses unit averages: we break each unit’s time series for each variable into deviations from its own mean. This is called the within transform. The between portion represents deviations between the unit’s mean and the overall mean. Stationarity considerations are generically implicit. We will break this up later.
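The decomposition above can be sketched on a toy two-unit panel (the toy values and column names are mine, chosen so the within and between pieces are obvious):

```r
# Within and between pieces on a toy two-unit panel
library(dplyr)
toy <- data.frame(id = rep(1:2, each = 3),
                  t  = rep(1:3, times = 2),
                  y  = c(1, 2, 3, 10, 20, 30))
decomposed <- toy %>%
  group_by(id) %>%
  mutate(y.within  = y - mean(y),            # deviation from own unit mean
         y.between = mean(y) - mean(toy$y))  # unit mean vs overall mean
```

Unit means are 2 and 20 against an overall mean of 11, so the within deviations are centered inside each unit while the between piece is constant within units.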
xt commands

In Stata’s language, xt is the way that one naturally refers to CSTS/TSCS data. Consider NT observations on some random variable y_{it} where i \in N and t \in T. The TSCS/CSTS commands almost always have this prefix.
- xtset: Declaring xt data
- xtdes: Describing xt data structure
- xtsum: Summarizing xt data
- xttab: Summarizing categorical xt data
- xttrans: Transition matrix for xt data

```r
library(haven)
HR.Data <- read_dta(url("https://github.com/robertwwalker/DADMStuff/raw/master/ISQ99-Essex.dta"))
library(skimr)
library(knitr)      # for kable()
library(kableExtra) # for scroll_box()
skim(HR.Data) %>% kable() %>% scroll_box(width="80%", height="50%")
```

| skim_type | skim_variable | n_missing | complete_rate | numeric.mean | numeric.sd | numeric.p0 | numeric.p25 | numeric.p50 | numeric.p75 | numeric.p100 | numeric.hist |
|---|---|---|---|---|---|---|---|---|---|---|---|
| numeric | IDORIGIN | 0 | 1.0000000 | 446.7178771 | 243.1931782 | 2.00 | 290.000 | 435.000 | 640.00 | 990.00 | ▆▇▇▆▂ |
| numeric | YEAR | 0 | 1.0000000 | 1984.5000000 | 5.1889328 | 1976.00 | 1980.000 | 1984.500 | 1989.00 | 1993.00 | ▇▆▇▆▇ |
| numeric | AI | 1061 | 0.6707014 | 2.7533549 | 1.0752989 | 1.00 | 2.000 | 3.000 | 3.00 | 5.00 | ▃▇▇▃▂ |
| numeric | SD | 587 | 0.8178150 | 2.2406072 | 1.1303528 | 1.00 | 1.000 | 2.000 | 3.00 | 5.00 | ▇▇▆▂▁ |
| numeric | POLRT | 382 | 0.8814401 | 3.8095070 | 2.2230297 | 1.00 | 2.000 | 3.000 | 6.00 | 7.00 | ▇▂▂▁▇ |
| numeric | MIL2 | 382 | 0.8814401 | 0.2725352 | 0.4453421 | 0.00 | 0.000 | 0.000 | 1.00 | 1.00 | ▇▁▁▁▃ |
| numeric | LEFT | 393 | 0.8780261 | 0.1763874 | 0.3812168 | 0.00 | 0.000 | 0.000 | 0.00 | 1.00 | ▇▁▁▁▂ |
| numeric | BRIT | 290 | 0.9099938 | 0.3553888 | 0.4787126 | 0.00 | 0.000 | 0.000 | 1.00 | 1.00 | ▇▁▁▁▅ |
| numeric | PCGNP | 443 | 0.8625078 | 3591.6509536 | 5698.3554010 | 52.00 | 390.000 | 1112.000 | 3510.00 | 36670.00 | ▇▁▁▁▁ |
| numeric | AINEW | 468 | 0.8547486 | 2.4433551 | 1.1558005 | 1.00 | 1.000 | 2.000 | 3.00 | 5.00 | ▇▇▇▃▂ |
| numeric | SDNEW | 468 | 0.8547486 | 2.2618010 | 1.1365604 | 1.00 | 1.000 | 2.000 | 3.00 | 5.00 | ▇▇▆▂▁ |
| numeric | IDGURR | 0 | 1.0000000 | 455.7709497 | 246.5201369 | 2.00 | 290.000 | 450.000 | 663.00 | 990.00 | ▆▇▇▇▃ |
| numeric | AILAG | 644 | 0.8001241 | 2.4499612 | 1.1479673 | 1.00 | 1.000 | 2.000 | 3.00 | 5.00 | ▇▇▇▃▂ |
| numeric | SDLAG | 644 | 0.8001241 | 2.2470908 | 1.1156632 | 1.00 | 1.000 | 2.000 | 3.00 | 5.00 | ▇▇▆▂▁ |
| numeric | PERCHPCG | 618 | 0.8081937 | 4.6138441 | 13.2208934 | -95.50 | -2.545 | 4.615 | 11.76 | 128.57 | ▁▂▇▁▁ |
| numeric | PERCHPOP | 293 | 0.9090627 | 2.1928815 | 4.0424128 | -48.45 | 0.910 | 2.220 | 2.94 | 126.01 | ▁▇▁▁▁ |
| numeric | LPOP | 115 | 0.9643079 | 15.4819279 | 1.8633316 | 11.00 | 14.510 | 15.590 | 16.64 | 20.89 | ▂▃▇▃▁ |
| numeric | PCGTHOU | 443 | 0.8625078 | 3.5916985 | 5.6983334 | 0.05 | 0.390 | 1.110 | 3.51 | 36.67 | ▇▁▁▁▁ |
| numeric | DEMOC3 | 793 | 0.7538796 | 3.6817620 | 4.3577178 | 0.00 | 0.000 | 0.000 | 9.00 | 10.00 | ▇▁▁▂▃ |
| numeric | CWARCOW | 407 | 0.8736809 | 0.0920071 | 0.2890873 | 0.00 | 0.000 | 0.000 | 0.00 | 1.00 | ▇▁▁▁▁ |
| numeric | IWARCOW2 | 380 | 0.8820608 | 0.0862069 | 0.2807187 | 0.00 | 0.000 | 0.000 | 0.00 | 1.00 | ▇▁▁▁▁ |
```r
library(tidyverse)
library(plm)
source(url("https://raw.githubusercontent.com/robertwwalker/DADMStuff/master/xtsum/xtsum.R"))
# Be careful with the ID variable, the safest is to make it factor; this can go wildly wrong
xtsum(IDORIGIN~., data=HR.Data) %>% kable() %>% scroll_box(width="80%", height="50%")
```

| | O.mean | O.sd | O.min | O.max | O.SumSQ | O.N | B.mean | B.sd | B.min | B.max | B.Units | B.t.bar | W.sd | W.min | W.max | W.SumSQ | Within.Ovr.Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| YEAR | 1984.5 | 5.189 | 1976 | 1993 | 86725.5 | 3222 | 1984.5 | 0 | 1984.5 | 1984.5 | 179 | 18 | 5.189 | -8.5 | 8.5 | 86725.5 | 1 |
| AI | 2.753 | 1.075 | 1 | 5 | 2497.538 | 2161 | 2.498 | 0.989 | 1 | 5 | 173 | 12.491 | 0.631 | -2.375 | 2.5625 | 860.822 | 0.345 |
| SD | 2.241 | 1.13 | 1 | 5 | 3365.455 | 2635 | 2.241 | 1.004 | 1 | 5 | 178 | 14.803 | 0.624 | -2.666667 | 3.0625 | 1025.695 | 0.305 |
| POLRT | 3.81 | 2.223 | 1 | 7 | 14029.94 | 2840 | 3.78 | 1.99 | 1 | 7 | 179 | 15.866 | 0.925 | -4 | 4.777778 | 2428.552 | 0.173 |
| MIL2 | 0.273 | 0.445 | 0 | 1 | 563.058 | 2840 | 0.24 | 0.377 | 0 | 1 | 179 | 15.866 | 0.216 | -0.9444444 | 0.8888889 | 132.778 | 0.236 |
| LEFT | 0.176 | 0.381 | 0 | 1 | 410.983 | 2829 | 0.157 | 0.334 | 0 | 1 | 179 | 15.804 | 0.157 | -0.8888889 | 0.8888889 | 69.611 | 0.169 |
| BRIT | 0.355 | 0.479 | 0 | 1 | 671.685 | 2932 | 0.335 | 0.473 | 0 | 1 | 179 | 16.38 | 0 | 0 | 0 | 0 | 0 |
| PCGNP | 3591.651 | 5698.355 | 52 | 36670 | 90205144379 | 2779 | 3449.178 | 5049.297 | 112.2222 | 22653.89 | 173 | 16.064 | 2278.412 | -12303.33 | 16961.67 | 14421042273 | 0.16 |
| AINEW | 2.443 | 1.156 | 1 | 5 | 3677.663 | 2754 | 2.379 | 1.012 | 1 | 5 | 178 | 15.472 | 0.622 | -2.388889 | 2.944444 | 1064.102 | 0.289 |
| SDNEW | 2.262 | 1.137 | 1 | 5 | 3556.241 | 2754 | 2.253 | 1.006 | 1 | 5 | 178 | 15.472 | 0.631 | -2.588235 | 3 | 1096.442 | 0.308 |
| IDGURR | 455.771 | 246.52 | 2 | 990 | 195747185 | 3222 | 455.771 | 247.173 | 2 | 990 | 179 | 18 | 0 | 0 | 0 | 0 | 0 |
| AILAG | 2.45 | 1.148 | 1 | 5 | 3396.045 | 2578 | 2.402 | 1.039 | 1 | 5 | 177 | 14.565 | 0.609 | -2.411765 | 3 | 955.37 | 0.281 |
| SDLAG | 2.247 | 1.116 | 1 | 5 | 3207.603 | 2578 | 2.236 | 0.991 | 1 | 5 | 177 | 14.565 | 0.608 | -2.5 | 3.058824 | 952.174 | 0.297 |
| PERCHPCG | 4.614 | 13.221 | -95.5 | 128.57 | 454983.6 | 2604 | 3.325 | 6.893 | -36.21333 | 15.03765 | 168 | 15.5 | 12.393 | -92.50235 | 114.8882 | 399763 | 0.879 |
| PERCHPOP | 2.193 | 4.042 | -48.45 | 126.01 | 47846.75 | 2929 | 2.842 | 9.443 | -2.126471 | 126.01 | 176 | 16.642 | 3.018 | -48.12235 | 80.69765 | 26663.59 | 0.557 |
| LPOP | 15.482 | 1.863 | 11 | 20.89 | 10784.05 | 3107 | 15.488 | 1.844 | 11.09056 | 20.76889 | 177 | 17.554 | 0.129 | -0.7288889 | 0.7311111 | 51.883 | 0.005 |
| PCGTHOU | 3.592 | 5.698 | 0.05 | 36.67 | 90204.45 | 2779 | 3.449 | 5.049 | 0.1122222 | 22.65389 | 173 | 16.064 | 2.278 | -12.30333 | 16.96167 | 14420.95 | 0.16 |
| DEMOC3 | 3.682 | 4.358 | 0 | 10 | 46107 | 2429 | 3.774 | 3.96 | 0 | 10 | 155 | 15.671 | 1.726 | -7.277778 | 7.941176 | 7229.815 | 0.157 |
| CWARCOW | 0.092 | 0.289 | 0 | 1 | 235.17 | 2815 | 0.095 | 0.245 | 0 | 1 | 179 | 15.726 | 0.175 | -0.8888889 | 0.9444444 | 85.693 | 0.364 |
| IWARCOW2 | 0.086 | 0.281 | 0 | 1 | 223.879 | 2842 | 0.092 | 0.227 | 0 | 1 | 179 | 15.877 | 0.19 | -0.8888889 | 0.9444444 | 102.992 | 0.46 |
In R, this is essentially a group_by calculation in the tidyverse. The within data are the overall data with group means subtracted.
```
# A tibble: 18 × 4
# Groups:   IDORIGIN [1]
   IDORIGIN  YEAR DEMOC3 DEMOC.Centered
      <dbl> <dbl>  <dbl>          <dbl>
 1       42  1976      1         -5.11
 2       42  1977      1         -5.11
 3       42  1978      6         -0.111
 4       42  1979      6         -0.111
 5       42  1980      6         -0.111
 6       42  1981      6         -0.111
 7       42  1982      7          0.889
 8       42  1983      7          0.889
 9       42  1984      7          0.889
10       42  1985      7          0.889
11       42  1986      7          0.889
12       42  1987      7          0.889
13       42  1988      7          0.889
14       42  1989      7          0.889
15       42  1990      7          0.889
16       42  1991      7          0.889
17       42  1992      7          0.889
18       42  1993      7          0.889
```
Big picture: Models for Single Time Series
ESSSSDA25-2W: Introduction and Summary